
Make Rc<T>::deref and Arc<T>::deref zero-cost #132553

Open
wants to merge 2 commits into
base: master
from

Conversation

EFanZh
Contributor

@EFanZh EFanZh commented Nov 3, 2024

Currently, Rc<T> and Arc<T> store pointers to RcInner<T> and ArcInner<T>. This PR changes the pointers so that they point to T directly instead.

This is based on the assumption that we access the T value more frequently than we access the reference counts. With this change, accessing the data no longer requires offsetting the pointer from RcInner<T> or ArcInner<T> to the contained data. This change might also enable some possibly useful future optimizations, such as:

  • Convert &[Rc<T>] into &[&T] within O(1) time.
  • Convert &[Rc<T>] into Vec<&T> utilizing memcpy.
  • Convert &Option<Rc<T>> into Option<&T> without branching.
  • Make Rc<T> and Arc<T> FFI compatible types where T: Sized.
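
To make the layout idea concrete, here is a minimal single-threaded sketch. It is not the code from this PR: `ThinRc`, `Inner`, `as_refs`, and `opt_as_ref` are hypothetical names invented for illustration, and the sketch assumes `T: Sized` and a `#[repr(C)]` allocation with the count placed in front of the value. It shows a deref with no pointer offset, a count lookup that steps backwards from the value, and the O(1) slice and branch-free `Option` conversions from the list above.

```rust
use std::cell::Cell;
use std::mem::offset_of;
use std::ops::Deref;
use std::ptr::{addr_of_mut, NonNull};

/// Heap allocation: the count sits in front of the value.
#[repr(C)]
struct Inner<T> {
    strong: Cell<usize>,
    value: T,
}

/// Hypothetical pointer type (not the PR's code): it stores the address of
/// `value`, not the address of the enclosing `Inner<T>`.
#[repr(transparent)]
struct ThinRc<T> {
    ptr: NonNull<T>,
}

impl<T> ThinRc<T> {
    fn new(value: T) -> Self {
        let inner = Box::into_raw(Box::new(Inner { strong: Cell::new(1), value }));
        // SAFETY: `inner` comes from `Box::into_raw`, so it is non-null and valid.
        let ptr = unsafe { NonNull::new_unchecked(addr_of_mut!((*inner).value)) };
        ThinRc { ptr }
    }

    /// Recovers the allocation header by stepping back from the value.
    fn inner(&self) -> *mut Inner<T> {
        // SAFETY: `ptr` always points at the `value` field of a live `Inner<T>`.
        unsafe { self.ptr.as_ptr().byte_sub(offset_of!(Inner<T>, value)).cast() }
    }

    fn strong_count(this: &Self) -> usize {
        // SAFETY: the allocation is alive for as long as `this` exists.
        let inner = unsafe { &*this.inner() };
        inner.strong.get()
    }
}

impl<T> Deref for ThinRc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // No offset arithmetic: the stored pointer already addresses the value.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Clone for ThinRc<T> {
    fn clone(&self) -> Self {
        // SAFETY: the allocation is alive for as long as `self` exists.
        let inner = unsafe { &*self.inner() };
        inner.strong.set(inner.strong.get() + 1);
        ThinRc { ptr: self.ptr }
    }
}

impl<T> Drop for ThinRc<T> {
    fn drop(&mut self) {
        let raw = self.inner();
        let remaining = {
            // SAFETY: the allocation is still alive at this point.
            let inner = unsafe { &*raw };
            inner.strong.set(inner.strong.get() - 1);
            inner.strong.get()
        };
        if remaining == 0 {
            // SAFETY: this was the last owner and `raw` came from `Box::into_raw`.
            drop(unsafe { Box::from_raw(raw) });
        }
    }
}

/// O(1) view of a slice of pointers as a slice of references.
fn as_refs<T>(rcs: &[ThinRc<T>]) -> &[&T] {
    // SAFETY: `ThinRc<T>` is `repr(transparent)` over `NonNull<T>`, which for
    // sized `T` has the layout of `&T` and here always points at a live `T`.
    unsafe { std::slice::from_raw_parts(rcs.as_ptr().cast::<&T>(), rcs.len()) }
}

/// Branch-free conversion: both options use the null-pointer niche.
fn opt_as_ref<T>(opt: &Option<ThinRc<T>>) -> Option<&T> {
    // SAFETY: same size, same niche, and any non-null bits are a valid `&T`.
    unsafe { *(opt as *const Option<ThinRc<T>>).cast::<Option<&T>>() }
}
```

With this layout, `deref` returns the stored pointer unchanged, while `strong_count`, `clone`, and `drop` pay for the backwards offset instead. That trade-off is the one the assumption above is betting on.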

@rustbot
Collaborator

rustbot commented Nov 3, 2024

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 3, 2024
@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from b283c44 to ae36f44 Compare November 3, 2024 09:14
@rust-log-analyzer

This comment has been minimized.

@marmeladema
Contributor

Would it potentially enable those types to have an FFI-compatible ABI, so that they could be passed to and returned from FFI functions directly, like Box?

@rust-log-analyzer

This comment has been minimized.

@EFanZh
Contributor Author

EFanZh commented Nov 3, 2024

Would it potentially enable those types to have an FFI-compatible ABI, so that they could be passed to and returned from FFI functions directly, like Box?

I think in theory it is possible, at least for sized types, but I am not familiar with how to formally make it so.
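
For context on the `Box` precedent raised in the question, here is a small sketch of how `Box<T>` with a sized `T` is used directly in `extern "C"` signatures today. `Point`, `point_new`, `point_sum`, and `point_free` are hypothetical names for illustration only. Whether `Rc<T>` or `Arc<T>` could ever be given a similar guarantee is not settled in this thread.

```rust
/// Hypothetical C-compatible payload, used only for this illustration.
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// Hands ownership to the caller; on the C side this is a plain `Point*`.
/// (A real export would also need a `no_mangle` attribute.)
pub extern "C" fn point_new(x: i32, y: i32) -> Box<Point> {
    Box::new(Point { x, y })
}

/// Borrows the value; on the C side this is a `const Point*`.
pub extern "C" fn point_sum(p: &Point) -> i32 {
    p.x + p.y
}

/// Takes ownership back; `Option<Box<T>>` lets the C side pass NULL.
pub extern "C" fn point_free(p: Option<Box<Point>>) {
    drop(p);
}
```

This works because a `Box<T>` for sized `T` is represented as a single non-null pointer to the `T` itself, which is the shape this PR moves `Rc<T>` and `Arc<T>` towards.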

@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from ae36f44 to 0d6165f Compare November 3, 2024 11:21
@rust-log-analyzer

This comment has been minimized.

@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from 0d6165f to 98edd5b Compare November 3, 2024 13:06
@rust-log-analyzer

This comment has been minimized.

@jhpratt
Member

jhpratt commented Nov 3, 2024

r? libs

@rustbot rustbot assigned joboet and unassigned jhpratt Nov 3, 2024
@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from 98edd5b to 8beb51d Compare November 4, 2024 16:29
@rust-log-analyzer

This comment has been minimized.

@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from 8beb51d to d7879fa Compare November 4, 2024 17:26
@rust-log-analyzer

This comment has been minimized.

@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from d7879fa to 317aa0e Compare November 4, 2024 18:40
@joboet
Member

joboet commented Nov 7, 2024

@EFanZh Is this ready for review? If so, please un-draft the PR.

@EFanZh
Contributor Author

EFanZh commented Nov 7, 2024

@joboet: The source code part is mostly done, but I haven’t finished updating LLDB and CDB pretty printers. The CI doesn’t seem to run those tests.

@joboet
Member

joboet commented Nov 8, 2024

No worries! I just didn't want to keep you waiting in case you had forgotten to change the state.
@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 8, 2024
@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch 3 times, most recently from f243654 to 1308bf6 Compare November 11, 2024 18:35
@Kobzol
Member

Kobzol commented Aug 7, 2025

To run perf here, the PR needs to be rebased first.

@EFanZh
Contributor Author

EFanZh commented Aug 7, 2025

I’ll fix the conflicts in a few days.

@wesleywiser wesleywiser removed the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Aug 7, 2025
@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from 22c49c6 to 817b116 Compare August 9, 2025 16:19
@Kobzol
Member

Kobzol commented Aug 9, 2025

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors

rust-bors bot commented Aug 9, 2025

⌛ Trying commit 817b116 with merge ec7bf66

To cancel the try build, run the command @bors try cancel.

rust-bors bot added a commit that referenced this pull request Aug 9, 2025
Make `Rc<T>::deref` and `Arc<T>::deref` zero-cost
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 9, 2025
@rust-bors

rust-bors bot commented Aug 9, 2025

☀️ Try build successful (CI)
Build commit: ec7bf66 (ec7bf66bfb634dca39e8b1a65c3a872c3cd13fd4, parent: ca77504943887037504c7fc0b9bf06dab3910373)

@rust-timer

This comment has been minimized.

@tgross35 tgross35 self-assigned this Aug 9, 2025
@rust-timer
Collaborator

Finished benchmarking commit (ec7bf66): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.8% | [0.3%, 14.7%] | 14 |
| Regressions ❌ (secondary) | 0.7% | [0.1%, 2.6%] | 22 |
| Improvements ✅ (primary) | -0.5% | [-1.5%, -0.1%] | 21 |
| Improvements ✅ (secondary) | -1.2% | [-3.0%, -0.0%] | 19 |
| All ❌✅ (primary) | 0.4% | [-1.5%, 14.7%] | 35 |

Max RSS (memory usage)

Results (primary 1.0%, secondary 1.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.2% | [1.3%, 6.1%] | 4 |
| Regressions ❌ (secondary) | 3.6% | [2.1%, 4.8%] | 9 |
| Improvements ✅ (primary) | -3.3% | [-4.3%, -2.3%] | 2 |
| Improvements ✅ (secondary) | -3.5% | [-6.4%, -2.0%] | 4 |
| All ❌✅ (primary) | 1.0% | [-4.3%, 6.1%] | 6 |

Cycles

Results (primary 4.8%, secondary -0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.8% | [2.0%, 12.7%] | 5 |
| Regressions ❌ (secondary) | 2.8% | [2.1%, 3.1%] | 7 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -5.3% | [-7.1%, -1.6%] | 5 |
| All ❌✅ (primary) | 4.8% | [2.0%, 12.7%] | 5 |

Binary size

Results (primary 0.5%, secondary 0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.0%, 2.2%] | 80 |
| Regressions ❌ (secondary) | 0.6% | [0.0%, 5.5%] | 96 |
| Improvements ✅ (primary) | -0.3% | [-0.4%, -0.2%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [-0.4%, 2.2%] | 82 |

Bootstrap: 463.988s -> 464.881s (0.19%)
Artifact size: 377.65 MiB -> 377.46 MiB (-0.05%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 9, 2025
@EFanZh EFanZh force-pushed the zero-cost-rc-arc-deref branch from 817b116 to d54c877 Compare August 10, 2025 01:36
@Kobzol
Member

Kobzol commented Aug 10, 2025

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors

rust-bors bot commented Aug 10, 2025

⌛ Trying commit d54c877 with merge 38a5d4f

To cancel the try build, run the command @bors try cancel.

rust-bors bot added a commit that referenced this pull request Aug 10, 2025
Make `Rc<T>::deref` and `Arc<T>::deref` zero-cost
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 10, 2025
@rust-bors

rust-bors bot commented Aug 10, 2025

☀️ Try build successful (CI)
Build commit: 38a5d4f (38a5d4f4c9ded25243c2946db27e12396bfed6a4, parent: 915a766b2f9fd53a8cd7b1fad003d3f8e488ff4b)

@rust-timer

This comment has been minimized.

@rust-timer
Collaborator

Finished benchmarking commit (38a5d4f): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.2% | [0.3%, 14.7%] | 11 |
| Regressions ❌ (secondary) | 0.6% | [0.1%, 2.6%] | 22 |
| Improvements ✅ (primary) | -0.5% | [-1.5%, -0.2%] | 28 |
| Improvements ✅ (secondary) | -1.1% | [-3.0%, -0.1%] | 26 |
| All ❌✅ (primary) | 0.3% | [-1.5%, 14.7%] | 39 |

Max RSS (memory usage)

Results (primary 1.0%, secondary -0.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.2% | [1.6%, 5.6%] | 5 |
| Regressions ❌ (secondary) | 2.9% | [2.6%, 3.1%] | 4 |
| Improvements ✅ (primary) | -2.6% | [-3.3%, -2.0%] | 3 |
| Improvements ✅ (secondary) | -3.3% | [-6.1%, -2.1%] | 6 |
| All ❌✅ (primary) | 1.0% | [-3.3%, 5.6%] | 8 |

Cycles

Results (primary 5.1%, secondary -0.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.1% | [2.1%, 13.3%] | 4 |
| Regressions ❌ (secondary) | 2.4% | [2.1%, 2.8%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.8% | [-4.2%, -3.3%] | 2 |
| All ❌✅ (primary) | 5.1% | [2.1%, 13.3%] | 4 |

Binary size

Results (primary 0.5%, secondary 0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.0%, 2.2%] | 77 |
| Regressions ❌ (secondary) | 0.6% | [0.0%, 5.6%] | 96 |
| Improvements ✅ (primary) | -0.3% | [-0.4%, -0.2%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [-0.4%, 2.2%] | 79 |

Bootstrap: 463.478s -> 463.036s (-0.10%)
Artifact size: 377.38 MiB -> 377.46 MiB (0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 10, 2025
@tgross35
Contributor

tgross35 commented Aug 13, 2025

The results here seem to be pretty mixed. Any idea why this seems to hurt some incremental builds? I suppose it could cause more work to do at build time, but the 14% increase building clif seems like something more substantial. Unless, is that code super Arc-heavy or something?

(cranelift-codegen is a benchmark that tests building rustc_codegen_cranelift, not a benchmark of the time clif takes to do codegen, right?)

@scottmcm
Member

Those icount results look entirely mergeable to me, at least. Notably, the checks are comfortably green, including 100% green in the primary benchmarks, which I find is often the best measure we have of how this impacts code at runtime. And IIRC there's a bunch of Arcs in the query system. (Sometimes doc can do that too, but here that's very quiet.) Then debug is mixed but still comfortably green, also 100% green in primary.

Opt isn't trivially-fine, but I think it's ok. Lots of them are incr anyway, which is relatively rare with opt. And the one big outlier in icount shuffled its codegen schedule massively:
image
So with that I think the icounts are misleading, and its wall time result is still within reasonableness: +1.93% but that's below the significance threshold since it looks like that happens a ton for it:
image

Since this makes Deref simpler and thus changes the inliner cost estimates for something quite common, it seems entirely plausible that this would thus permute the opt results a whole bunch.


I think I might actually be more curious about the binary size changes. Dropping 126KiB off rustc_driver.so looks great, but then image and cargo get materially bigger in binary size for opt-full, which seems weird. I would have expected this to help drop size a bit by reducing the need for offsetting loads, but almost everything is bigger instead.

Any idea where that extra binary size is coming from?

@EFanZh
Contributor Author

EFanZh commented Aug 13, 2025

@tgross35: There isn’t any significant change since the last perf run (https://perf.rust-lang.org/compare.html?start=b88076097751f7677b850b94b20faf5679fca321&end=1a76f3df0b6373e760df2514a5af2587f3e01aff&stat=instructions:u). But my local development environment is currently broken, and I’ll need some time to analyze the perf result.

@tgross35
Contributor

Oh yeah, the wins definitely outweigh the losses here, no disagreement. I’m just wondering what makes clif such an outlier.

The cargo changes have to be inlining; maybe this just tips the scale for some commonly called, medium-sized function to be inlined.

Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.